1. Executive Summary

Building a robust spoken dialogue system for a new application, task, or domain currently
requires considerable effort, including substantial efforts in data collection, building language
models, grammar/parser development, building a custom dialogue manager, and developing the
connection to the system’s "back-end" systems (e.g., a database query or knowledge-based
system). This project developed key parts of a technology base upon which spoken dialogue
systems can be rapidly constructed for new domains. Our approach involves building generic
components (i.e., ones that apply in any practical domain) for all stages of spoken dialogue
understanding, and developing techniques for rapidly customizing the generic components to new
domains. To achieve this goal we made progress in several important areas: (1) developing a
generic domain-independent grammar of spoken English together with techniques for optimizing
parser performance for specific domains, (2) defining a domain-independent representation of
semantic meaning with an ontology mapping framework that allows the user to define relatively
simple mapping rules to the domain-specific communication/representation language, and
(3) building a domain-general collaborative problem solving framework that enables rapid
construction of the dialogue agents and provides the link to domain-specific reasoning capabilities.

During this project, we used the generic technology developed to enable the construction of a
dialogue-based task learning system called PLOW. A paper based on this system won the
outstanding paper award at the annual conference of the Association for the Advancement of
Artificial Intelligence (AAAI) in 2007 (Allen et al., 2007). A core component of that system is a
domain-general deep language understanding system. A key accomplishment in this effort was
developing techniques to enable broad-coverage deep understanding by taking advantage of
many recent developments in statistical techniques and corpora that are typically used only for
shallow understanding. Our preliminary experiments on parsing previously unseen text indicate
great promise for the work (Allen et al., 2008).

In  the  remainder  of  this  report,  we  describe  these  accomplishments  in  more  detail. 

2.  Broad-Coverage  Deep  Natural  Language  Understanding 

Deep language understanding involves mapping language to expressions capturing its intended
meaning, in terms of concepts and relations in an ontology that supports reasoning. Deep
understanding is needed in many applications, including dialogue-based human-computer
interfaces to intelligent systems/agents, tutoring and advice-giving systems, systems that learn
from instruction, and systems that learn from reading.

There seems to be a consensus in the field that broad-coverage, high-accuracy deep parsing is
currently not feasible. We do not believe this is the case and discuss here the core generic
technologies we developed for deep semantic processing of natural language. In order to attain
high-accuracy broad-coverage deep processing, we augmented the core system with statistical
processing to aid in disambiguation, and large-scale lexical resources to extend the lexicon. In
this way, the deep understanding system can be guided by a wide range of advice derived from
statistical language processing, including named-entity recognizers, statistical parsers, word
sense disambiguation techniques and semantic role identification, plus a large base of shallow
generic knowledge. In other words, the deep parser provides the framework to integrate all the
results from a diverse range of statistical models into a consistent deep logical form. Initial
experiments suggest that this approach has great promise.

Figure 1: The LF graph for “The three small dogs frequently eat bones”

The  Logical  Form 

The logical form (LF) is the semantic representation language produced by the parser. It is
designed to be an expressive yet intuitive formalism for representing the meaning of sentences.
In designing the LF, we had multiple considerations: (1) it needs to be expressive, providing good
coverage of the complex semantic phenomena in language, including modal operators,
generalized quantifiers, and underspecified scoping constraints (cf. MRS (Copestake et al.,
2005)); (2) it needs to be fully indexed into the word senses in the semantic ontology, as opposed
to using the uninterpreted predicates found in many logical forms; (3) it needs to support robust
processing of sentence fragments that are common in speech; (4) it needs to support the
implementation of ontology-mapping rules; and (5) it needs to be understandable to humans,
since readability of formalisms is critical for debugging and analysis.

We define the LF in its graphical form. Besides being more intuitive, the graphical form allows
interesting comparisons to approaches that produce partial semantic analyses, such as statistical
word-sense and semantic role disambiguation techniques. In addition, the graphical formalism
leads to easier formal analysis. Consider the LF for the sentence The three small dogs frequently
eat bones shown in Figure 1. There are many types of objects evoked by this sentence, captured
by the nodes in the graph. First, there is the event of the dogs eating bones, where the node
captures a reified event in a Davidsonian-style (Davidson, 1967) representation. Next we have
properties like small, which are reified in the same way. The interpretation of the three small dogs
requires several nodes, including a set of size three, consisting of dogs that are small (rather than
the set being small). Furthermore, as a definite description, we expect to be able to identify the
set of dogs from the discourse context. Finally, we need to capture that bones refers to a kind of
object rather than, say, a specific set of bones. Note that each node indicates a specifier
(indicating the type of node, be it a generalized quantifier, event identifier, or kind) as well as
the type of the object. This is critical for subsequent discourse processing. Nodes are connected
by arcs that indicate argument relations (semantic roles from the LF ontology) and dependency
relationships (critical for resolving the unscoped LF into a fully scoped formal representation).
There are two types of arcs: those connecting to terms and those connecting to
predicates/formulas. The distinction between them is important for the quantifier scoping
algorithm.
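The node-and-arc structure described above can be illustrated with a small sketch. This is not the system's actual data structure: the class names, the specifier labels (EVENT, THE-SET, KIND, PROPERTY, VALUE), the type names, and the role names (AGENT, THEME, OF, SIZE) are all hypothetical placeholders chosen for illustration, and the distinction between term arcs and predicate arcs is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    ident: str       # variable naming the node, e.g. "e1"
    specifier: str   # what sort of node this is (event, set, kind, ...)
    type_: str       # ontology type; labels here are illustrative only
    word: str = ""   # the word that evoked the node


@dataclass
class LFGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)  # (source, role, target)

    def add_node(self, ident, specifier, type_, word=""):
        self.nodes[ident] = Node(ident, specifier, type_, word)

    def add_arc(self, source, role, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an arc")
        self.arcs.append((source, role, target))

    def roles_of(self, ident):
        """Return {role: target} for all arcs leaving the given node."""
        return {role: tgt for (src, role, tgt) in self.arcs if src == ident}


def build_example_graph():
    """Hypothetical encoding of 'The three small dogs frequently eat bones'."""
    g = LFGraph()
    g.add_node("e1", "EVENT", "EAT", "eat")             # reified event
    g.add_node("d1", "THE-SET", "DOG", "dogs")          # definite set of dogs
    g.add_node("n1", "VALUE", "NUMBER-3", "three")      # size of the set
    g.add_node("s1", "PROPERTY", "SMALL", "small")      # reified property
    g.add_node("f1", "PROPERTY", "FREQUENT", "frequently")
    g.add_node("b1", "KIND", "BONE", "bones")           # a kind, not a set
    g.add_arc("e1", "AGENT", "d1")
    g.add_arc("e1", "THEME", "b1")
    g.add_arc("d1", "SIZE", "n1")
    g.add_arc("s1", "OF", "d1")   # small applies to the dogs, not to the set size
    g.add_arc("f1", "OF", "e1")
    return g
```

Calling `build_example_graph().roles_of("e1")` returns the event's argument arcs, which is the kind of lookup a later discourse or scoping stage would presumably need.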

The LF formalism has additional features to capture aspects such as coreference relations,
implicit arguments to predicates, complex quantification (e.g., almost all dogs, every other dog,
all but one), modals, tense, aspect, negation, complex adverbials, numbers, time and date
expressions, and other complicated phenomena.

The Core Parsing Technology: The grammar is a lexicalized context-free grammar, augmented
with feature structures and feature unification. The grammar is motivated by X-bar theory, and
draws on principles from GPSG (e.g., head and foot features) and HPSG. While it has a
context-free backbone, the parsing is best seen as a search through possible logical forms. The
search in the parser is pruned by domain-general selectional restrictions from the ontology to
eliminate semantically anomalous sense combinations during parsing. The parser builds
constituent/logical forms bottom-up using a best-first search strategy similar to A*, combining
pre-specified rule and lexical weights and the influences of the statistical techniques described
below. The search terminates when a pre-specified number of spanning constituents have been
found or a pre-specified maximum chart size is reached. The chart is then searched using a
dynamic programming algorithm to find the least-cost sequence of constituent/logical forms
according to a scoring function that can be varied according to the genre being processed.
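The final chart search can be sketched as a shortest-path style dynamic program over word positions. The sketch below is an assumption-laden illustration, not the parser's implementation: the tuple layout of a constituent, the function name, and the per-constituent `fragment_penalty` (a stand-in for a genre-dependent scoring function that favors fewer, larger constituents) are all hypothetical.

```python
def least_cost_sequence(constituents, n, fragment_penalty=1.0):
    """Find the cheapest sequence of constituents that covers positions 0..n.

    constituents: iterable of (start, end, cost, label), 0 <= start < end <= n.
    fragment_penalty: hypothetical extra cost charged per constituent used.
    Returns (sequence, total_cost); sequence is None if no cover exists.
    """
    inf = float("inf")
    best = [inf] * (n + 1)    # best[i] = cheapest cost of covering 0..i
    back = [None] * (n + 1)   # back[i] = (start, label) of last constituent
    best[0] = 0.0

    # Visiting constituents by end position guarantees best[start] is final
    # before it is used, because start < end for every constituent.
    for start, end, cost, label in sorted(constituents, key=lambda c: c[1]):
        candidate = best[start] + cost + fragment_penalty
        if candidate < best[end]:
            best[end] = candidate
            back[end] = (start, label)

    if best[n] == inf:
        return None, inf

    # Follow back-pointers from the end of the input to recover the sequence.
    sequence, pos = [], n
    while pos > 0:
        start, label = back[pos]
        sequence.append((start, pos, label))
        pos = start
    sequence.reverse()
    return sequence, best[n]
```

With a chart holding only an NP over words 0-2 (cost 1.0) and a VP over words 2-4 (cost 1.5), the function returns the two-fragment sequence; adding a spanning S constituent of cost 2.0 makes the single spanning analysis cheaper. Returning a fragment sequence when no spanning constituent exists is one way to support the robust processing of sentence fragments mentioned earlier.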

The current lexicon contains approximately 7,000 hand-built lexical lemmas (with morphological
variants, yielding 17,000 words), each identified with a semantic concept in the LF ontology that
specifies the selectional restrictions on its possible arguments and modifiers.

The Broad Coverage system: To attain broader coverage, we used input from a variety of
external resources. We built a subsystem for unknown word lookup that accesses lexical
resources such as WordNet (Miller, 1995) and Comlex (Macleod et al., 1994). The WordNet
senses are mapped to the LF ontology at an abstract level and the combined resource information
is used to build lexical entries with approximate semantic and syntactic structures for words not in
the core lexicon. Because the information in such entries is underspecified, the parser must cope
with significantly increased levels of ambiguity when processing such words.

Because it was developed for speech applications, the parser is designed to accept word lattices
as input so speech recognizers can pre-populate the chart with different word hypotheses, letting
the parser choose among them based on what entries make the best overall interpretations. We
use the same mechanism for integrating corpus-based preprocessors such as a named entity
recognizer, which adds hypotheses to the input chart about possible named entities. Note that
these are only hypotheses; the parser does not have to use them. As with word hypotheses from a
speech recognizer, the parser will choose the input hypotheses that lead to the best overall
interpretation. In addition, we can use statistical part-of-speech and word sense disambiguation
techniques to suggest likely interpretations of words in the input chart. Using techniques similar
to Swift et al. (2004) and Cahill et al. (2007), the extended system also receives constituent
structure advice from a state-of-the-art statistical parser. For the results reported here, we used
the out-of-the-box unlexicalized stochastic context-free grammar parser from Stanford (Klein and
Manning, 2003). Again, these are preferences that help guide parsing, but do not limit the range
of possible overall interpretations. The system with these extensions is shown in Figure 2. The
parts with dotted outlines are under development and not included in the current evaluations.

Figure 2: Extending and Guiding Deep Parsing

Evaluation 

We performed an evaluation of the coverage and accuracy of the extended parser on seven
paragraphs (Text 1-7) submitted by seven different research groups for a common evaluation at
the workshop on the semantics of text processing (Bos and Delmonte, 2008). Below is a sample
paragraph, Text #6, which proved the most challenging for our system:

Amid the tightly packed row houses of North Philadelphia, a pioneering urban farm is providing
fresh local food for a community that often lacks it, and making money in the process. Greensgrow,
a one-acre plot of raised beds and greenhouses on the site of a former steel-galvanizing factory, is
turning a profit by selling its own vegetables and herbs as well as a range of produce from local
growers, and by running a nursery selling plants and seedlings. The farm earned about $10,000 on
revenue of $450,000 in 2007, and hopes to make a profit of 5 percent on $650,000 in revenue in
this, its 10th year, so it can open another operation elsewhere in Philadelphia.

We  defined  precision  and  recall  measures  on  the  LF.  Given  a  gold-standard  LF-graph,  we  can 
evaluate  the  LF  graph  produced  by  a  system  by  defining  node  and  edge  scoring  criteria  and  then 
computing  the  node  alignment  that  maximizes  the  overall  score.  The  evaluation  metric  between  a 
gold  LF  graph  G  and  a  test  LF  graph  T  is  then  defined  as  the  maximum  score  produced  by  any 
node/edge  alignment  from  the  gold  to  the  test  LF. 
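To make the metric concrete, here is a minimal brute-force sketch. The scoring criteria are assumptions, since the report does not spell them out: one point for each aligned node pair with identical labels, one point for each gold edge whose aligned endpoints carry the same role in the test graph, with precision and recall taken as the best score divided by the number of test and gold elements respectively. Exhaustive search over alignments is only practical for very small graphs and is used here purely for clarity.

```python
from itertools import permutations


def alignment_score(gold_nodes, gold_edges, test_nodes, test_edges, mapping):
    """Score one gold-to-test node mapping (assumed criteria: exact matches)."""
    score = 0
    for g, t in mapping.items():
        if t is not None and gold_nodes[g] == test_nodes[t]:
            score += 1
    for src, role, tgt in gold_edges:
        # An edge counts if its image under the mapping exists in the test graph.
        if (mapping[src], role, mapping[tgt]) in test_edges:
            score += 1
    return score


def best_alignment(gold_nodes, gold_edges, test_nodes, test_edges):
    """Try every injective mapping; gold nodes may also stay unaligned (None)."""
    gold_ids = list(gold_nodes)
    slots = list(test_nodes) + [None] * len(gold_ids)
    best_score, best_map = -1, {}
    for choice in permutations(slots, len(gold_ids)):
        mapping = dict(zip(gold_ids, choice))
        s = alignment_score(gold_nodes, gold_edges, test_nodes, test_edges, mapping)
        if s > best_score:
            best_score, best_map = s, mapping
    return best_score, best_map


def precision_recall(gold_nodes, gold_edges, test_nodes, test_edges):
    score, _ = best_alignment(gold_nodes, gold_edges, test_nodes, test_edges)
    test_total = len(test_nodes) + len(test_edges)
    gold_total = len(gold_nodes) + len(gold_edges)
    precision = score / test_total if test_total else 0.0
    recall = score / gold_total if gold_total else 0.0
    return precision, recall


# Toy graphs: nodes map an id to a type label, edges are (source, role, target).
gold_nodes = {"g1": "EAT", "g2": "DOG", "g3": "BONE"}
gold_edges = {("g1", "AGENT", "g2"), ("g1", "THEME", "g3")}
test_nodes = {"t1": "EAT", "t2": "DOG", "t3": "STONE", "t4": "SMALL"}
test_edges = {("t1", "AGENT", "t2"), ("t1", "THEME", "t3"), ("t4", "OF", "t2")}
```

On these toy graphs the best alignment scores 4 (two matching nodes and two matching edges), giving a precision of 4/7 against the seven test elements and a recall of 4/5 against the five gold elements.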

We parsed the seven texts to obtain an LF-graph for each. Then we hand-built a gold-standard
LF-graph for each paragraph. Using the precision and recall measures discussed above, the base
parser attained 61.4% precision and 67.2% recall on the unseen data. We then performed a
limited amount of lexical and grammatical development based on the evaluation: adding 26 new
lexical items (17 nouns, 1 verb, 7 adjectives and 1 adverb), 33 new or modified senses for
existing lexical items, 7 new ontology concepts, and two grammar rules. Word sense
modifications included adding a new argument structure pattern to a lexical entry and/or a new
semantic role to an existing concept. For example, in some cases an ontology concept included
an agent role, but not one for a more general cause role. We did not attempt to add all unknown
words and senses; aside from the proper nouns, there are still 14 unknown words (e.g.,
merchandising, propellant, nitrocellulose) remaining in the texts for which we derive entries for
analysis from unknown word lookup.

After this development, we estimated the potential of the extended system by rerunning the base
parser and then several combinations of the extensions. By adding named-entity recognition,
unknown word lookup, and part-of-speech tagging advice, performance rises to 78.2% precision
and 82.6% recall. Adding advice on constituent bracketing using the Stanford parser gave only a
slight improvement. These results are summarized in Table 1. Because almost no prior work has
attempted an evaluation of deep understanding, there are few results to compare against.
However, on the face of the scores alone, we think we have made a convincing case that
domain-independent, broad-coverage, deep understanding of language is a technology within
reach.

3.  Using  A  Collaborative  Problem  Solving  Agent  for  One-shot  Task  Learning 

We developed the generic collaborative problem solving model using several different
applications as test cases. The most significant system is one that focuses on agents that can
acquire the task models they need from intuitive language-rich demonstrations by humans. These
agents use the same collaborative architecture to learn tasks as they do to perform tasks. The
system displays an integrated intelligence that results from sophisticated natural language
understanding, reasoning, learning, and acting capabilities unified within a collaborative agent
architecture.

Background  on  Task  Learning 

In previous work, researchers have attempted to learn new tasks by observation, creating agents
that learn through observing an expert’s demonstration (Angros et al. 2002; Lau & Weld 1999;
Lent & Laird 2001). These techniques require observing multiple examples of the same task, and
the number of training examples required increases dramatically with the complexity of the task.
To be effective, however, collaborative assistants need to be able to acquire tasks much more
quickly, typically from a single example, possibly with some clarification dialogue. To enable
this, in our system the teacher not only demonstrates the task, but also gives a “play-by-play”
description of what they are doing. This is a natural method that people already use when
teaching other people, and our system exploits this natural capability. By combining the
information from language understanding with prior knowledge and a concrete example
demonstrated by the user, our system (called PLOW) can learn complex tasks involving iterative
loops in a single short training session.


           Initial      Baseline        w/ NER, POS,    w/ constituent
           Baseline     System          and UKW         advice from
           System       after devel.    lookup          Stanford Parser

Prec.      61.4%        74.4%           78.2%           79.0%
Recall     67.2%        74.4%           82.6%           82.8%

Table 1: Evaluation on combined texts


PLOW learns tasks that can be performed within a web browser. These are typically information
management tasks, e.g., finding appropriate sources, retrieving information, filing requisitions,
booking flights, and purchasing things. Figure 3 shows the user interface as it was used in the
evaluation. The main window on the left is simply the Mozilla browser, instrumented so that
PLOW can monitor user actions. On the right is the procedure that PLOW has learned so far,
summarized back in language from the task model using PLOW’s language generation
capabilities. Across the bottom is a chat window that shows the most recent interactions. The
user can switch between speech and keyboard throughout the interaction.

The  Agent  Architecture 


The understanding components combine natural language (speech or keyboard) with the
observed user actions on the GUI. After full parsing, semantic interpretation and discourse
interpretation produce plausible intended actions. These are passed to the collaborative problem
solving (CPS) agent, which settles on the most likely intended interpretation given the current
problem solving context. Depending on the actions, the CPS agent then drives other parts of the
system. For example, if the recognized user action was to demonstrate the next step in the task,
the CPS agent invokes task learning, which if successful will update the task models in the
knowledge base. If, on the other hand, the recognized user intent was to request the execution of
a (sub)task, the CPS agent attempts to look up a task that can accomplish this action in the
knowledge base. It then invokes the execution system to perform the task. During collaborative
learning, the system may actually do both: it may learn a new step in the task being learned, but
because it already knows how to do the subtask, it performs it for the user. This type of
collaborative execution while learning is critical in enabling the learning of iterative steps
without requiring the user to tediously demonstrate each loop through the iteration.
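The dispatch behavior described above can be sketched as follows. This is a toy illustration under stated assumptions, not PLOW's architecture: the class, the method names, the intent labels ("demonstrate-step", "request-execution"), and the representation of a task as a flat list of step names are all hypothetical.

```python
class CPSAgent:
    """Toy dispatcher: routes recognized intents to learning and/or execution."""

    def __init__(self):
        self.knowledge_base = {}   # task name -> list of step names
        self.learning = None       # name of the task currently being taught
        self.executed = []         # primitive steps performed so far

    def begin_task(self, name):
        self.learning = name
        self.knowledge_base[name] = []

    def end_task(self):
        self.learning = None

    def learn_step(self, step):
        if self.learning is None:
            raise RuntimeError("no task is currently being learned")
        self.knowledge_base[self.learning].append(step)

    def execute(self, name):
        # A step that names a known task is run as a subtask; anything else
        # is treated as a primitive action and simply recorded.
        for step in self.knowledge_base[name]:
            if step in self.knowledge_base:
                self.execute(step)
            else:
                self.executed.append(step)

    def handle(self, intent, content):
        if intent == "demonstrate-step":
            self.learn_step(content)
            return "learned"
        if intent == "request-execution":
            if content not in self.knowledge_base:
                return "unknown-task"
            if self.learning is not None and content != self.learning:
                # Collaborative case: record the subtask as a new step of the
                # task being taught and also perform it for the user.
                self.learn_step(content)
                self.execute(content)
                return "learned-and-executed"
            self.execute(content)
            return "executed"
        return "ignored"
```

The "learned-and-executed" branch corresponds to the case where the system learns a new step and performs an already-known subtask in the same turn.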
While we have shown examples of how integrating language, dialogue, reasoning and learning
has great potential for effective one-shot task learning, the real test is whether ordinary users can
quickly learn to use the system to teach new procedures. There are many possible pitfalls: (1) do
we have comprehensive enough natural language understanding capabilities so that users
expressing information in intuitive ways are likely to be understood? (2) can we really learn
robust task models from a single example? (3) can the users easily determine whether the system
is learning correctly as they are teaching the system?

Figure 3: The PLOW Interface

In August 2006, we delivered a version of the PLOW system to independently contracted
evaluators. At that point, we had developed the system to ensure that we (the developers) could
effectively teach PLOW to learn how to answer seventeen pre-determined test question templates.
The evaluators recruited 16 test subjects who received general training on how to use PLOW and
many other applications that were part of the overall project evaluation. Among these were three
other task learning systems: one learned entirely from passive observation, one used a
sophisticated GUI primarily designed for editing procedures but extended to allow the definition
of new procedures, and the third used an NL-like query and specification language that required
users to have a detailed knowledge of the HTML producing the web pages.

After training, the subjects then performed the first part of the test, in which they had to use
different systems to teach some subset of the predefined test questions. Seven of these involved
the PLOW system. Once the procedures were learned by the systems, the evaluators created a set
of new test examples by specifying values for the input parameters to the task and then scored the
results from executing the learned task models using predetermined scoring metrics
individualized to each question. The PLOW system did well on this test, scoring 2.82 out of 4
across all test questions and the 16 subjects.

The  second  part  of  the  test  involved  a  set  of  10  new  “surprise”  test  questions  not  previously  seen 
by  any  of  the  developers  (see  Figure  1).  Some  of  these  were  close  variants  to  the  original  test 
questions,  and  some  were  entirely  new  tasks.  The  sixteen  subjects  had  one  work  day  to  teach 
whichever  of  these  surprise  tasks  they  wished,  using  whichever  of  the  task  learning  systems  they 
wished.  As  a  result,  this  test  reveals  not  only  the  core  capability  for  learning  new  tasks,  but  also 
evaluates  the  usability  of  the  four  task  learning  systems. 

PLOW  did  very  well  on  this  test  on  all  measures.  Out  of  the  16  users,  thirteen  of  them  used 
PLOW  to  teach  at  least  one  question.  Of  the  other  systems,  the  next  most  used  system  was  used 
by  eight  users.  If  we  look  at  the  total  number  of  tasks  successfully  taught,  we  see  that  PLOW  was 
used  to  teach  30  out  of  the  55  task  models  that  were  constructed  during  the  day.  Furthermore,  the 
tasks  constructed  using  PLOW  received  the  highest  average  score  in  the  testing  (2.2  out  of  4). 

Concluding  Remarks 

This project has developed significant generic technology for dialogue systems that is reusable
across domains. We have developed and demonstrated the potential of broad-coverage deep
language understanding, and developed a generic collaborative problem solving architecture that
can enable sophisticated mixed-initiative dialogue, as demonstrated in the PLOW system.